Feat/better reprocess memory #298
…290)
- Updated the description for the 'asr-services' to remove the specific mention of 'Parakeet', making it more general.
- Improved the console output for auto-selected services to include the transcription provider label, enhancing user feedback during service selection.

- Removed redundant Obsidian and Knowledge Graph configuration checks from services.py, streamlining the command execution process.
- Updated wizard.py to enhance user experience by setting default options for speaker recognition during service selection.
- Improved Neo4j password handling in setup processes, ensuring consistent configuration prompts and feedback.
- Introduced a new cron scheduler for managing scheduled tasks, enhancing the backend's automation capabilities.
- Added new entity annotation features, allowing for corrections and updates to knowledge graph entities directly through the API.

- Added new configuration options for VibeVoice ASR in defaults.yml, including batching parameters for audio processing.
- Updated Docker Compose files to mount the config directory, ensuring access to ASR service configurations.
- Enhanced the VibeVoice transcriber to load configuration settings from defaults.yml, allowing for dynamic adjustments via environment variables.
- Introduced quantization options for model loading in the VibeVoice transcriber, improving performance and flexibility.
- Refactored the speaker identification process to streamline audio handling and improve logging for better debugging.
- Updated documentation to reflect new configuration capabilities and usage instructions for the VibeVoice ASR provider.

- Introduced functions for checking LangFuse configuration in services.py, ensuring proper setup for observability.
- Updated wizard.py to facilitate user input for LangFuse configuration, including options for local and external setups.
- Implemented memory reprocessing logic in memory services to update existing memories based on speaker re-identification.
- Enhanced speaker recognition client to support per-segment identification, improving accuracy during reprocessing.
- Refactored various components to streamline handling of LangFuse parameters and improve overall service management.
Caution: Review failed. Failed to post review comments.

📝 Walkthrough

This pull request introduces a comprehensive system for LLM observability, prompt management, and background job orchestration. It adds a prompt registry backed by LangFuse, a config-driven cron scheduler for automated tasks, entity-level knowledge graph corrections, per-segment speaker identification, audio batching for long-form ASR, transcript reprocessing with speaker-aware memory updates, and corresponding frontend capabilities for job and annotation management.
Sequence Diagram(s)
sequenceDiagram
participant Client as Client
participant API as API Server
participant PromptRegistry as PromptRegistry
participant LangFuse as LangFuse
participant LLMProvider as LLMProvider
Client->>API: Request with context
API->>PromptRegistry: get_prompt(prompt_id, **variables)
PromptRegistry->>LangFuse: Fetch prompt (async)
alt LangFuse Available
LangFuse-->>PromptRegistry: Prompt template
else LangFuse Unavailable
PromptRegistry->>PromptRegistry: Use default template
end
PromptRegistry-->>API: Compiled prompt
API->>LLMProvider: Call with compiled prompt
LLMProvider-->>API: Response
API-->>Client: Result
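
The fallback path in the first diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PR's implementation: the `PromptRegistry` class body, the `register_default` method, the injected `fetch_remote` callable, and the use of `str.format` placeholders are all hypothetical stand-ins. Only the `get_prompt(prompt_id, **variables)` call shape comes from the diagram, and no real LangFuse API is called here.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

# Hypothetical: an async callable that returns a template for a prompt id and
# raises if the remote prompt store (LangFuse in the PR) cannot be reached.
RemoteFetcher = Callable[[str], Awaitable[str]]


class PromptRegistry:
    """Illustrative registry: try the remote template, else use a local default."""

    def __init__(self, fetch_remote: Optional[RemoteFetcher] = None) -> None:
        self._fetch_remote = fetch_remote
        self._defaults: Dict[str, str] = {}

    def register_default(self, prompt_id: str, template: str) -> None:
        # Hypothetical helper: store the template used when the remote is down.
        self._defaults[prompt_id] = template

    async def get_prompt(self, prompt_id: str, **variables: str) -> str:
        template: Optional[str] = None
        if self._fetch_remote is not None:
            try:
                template = await self._fetch_remote(prompt_id)
            except Exception:
                # Any remote failure falls through to the default template.
                template = None
        if template is None:
            template = self._defaults[prompt_id]
        # Assumption: templates use str.format-style {name} placeholders.
        return template.format(**variables)


# Usage: one fetcher that always fails, one that succeeds.
async def _unreachable(prompt_id: str) -> str:
    raise ConnectionError("remote prompt store unreachable")


async def _reachable(prompt_id: str) -> str:
    return "Remote says {text} to {speaker}"


offline = PromptRegistry(fetch_remote=_unreachable)
offline.register_default("memory.extract", "Summarize for {speaker}: {text}")
compiled_fallback = asyncio.run(
    offline.get_prompt("memory.extract", speaker="Alice", text="hello")
)

online = PromptRegistry(fetch_remote=_reachable)
online.register_default("memory.extract", "Summarize for {speaker}: {text}")
compiled_remote = asyncio.run(
    online.get_prompt("memory.extract", speaker="Alice", text="hello")
)
```

Injecting the fetcher keeps the fallback logic testable without a running observability backend. Whether the actual registry is structured this way is not stated in the walkthrough.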
sequenceDiagram
participant Worker as RQ Worker
participant TranscriptVersion as Transcript Version
participant MemoryService as Memory Service
participant SpeakerDiff as Speaker Diff
participant LLMProvider as LLM Provider
participant KnowledgeGraph as Knowledge Graph
Worker->>TranscriptVersion: Fetch old/new segments
Worker->>SpeakerDiff: compute_speaker_diff(old, new)
SpeakerDiff-->>Worker: Change records
Worker->>MemoryService: reprocess_memory(diff, context)
MemoryService->>LLMProvider: propose_reprocess_actions
LLMProvider-->>MemoryService: Memory updates (ADD/UPDATE/DELETE)
MemoryService->>KnowledgeGraph: Apply entity updates
KnowledgeGraph-->>MemoryService: Confirmation
MemoryService-->>Worker: Success
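
To make the `compute_speaker_diff(old, new)` step in the diagram above more concrete, here is a minimal sketch. The function name comes from the diagram, but everything else is an assumption for illustration: the `Segment` and `SpeakerChange` types, their fields, and the choice to match segments by identical `(start, end)` times. The PR may align segments differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Segment:
    # Hypothetical segment shape; field names are assumptions.
    start: float
    end: float
    speaker: str
    text: str


@dataclass(frozen=True)
class SpeakerChange:
    # Hypothetical change record handed to the memory reprocessing step.
    start: float
    end: float
    old_speaker: str
    new_speaker: str
    text: str


def compute_speaker_diff(old: List[Segment], new: List[Segment]) -> List[SpeakerChange]:
    """Return one record per segment whose speaker label changed.

    Assumption: old and new versions share segment boundaries, so segments
    are matched by their (start, end) pair. Unmatched segments are ignored.
    """
    old_by_span: Dict[Tuple[float, float], Segment] = {(s.start, s.end): s for s in old}
    changes: List[SpeakerChange] = []
    for seg in new:
        previous = old_by_span.get((seg.start, seg.end))
        if previous is not None and previous.speaker != seg.speaker:
            changes.append(
                SpeakerChange(seg.start, seg.end, previous.speaker, seg.speaker, seg.text)
            )
    return changes


# Usage: the second segment is re-identified from a generic label to a name.
old_segments = [
    Segment(0.0, 2.5, "Speaker 0", "Hi there."),
    Segment(2.5, 6.0, "Speaker 1", "I moved to Berlin last year."),
]
new_segments = [
    Segment(0.0, 2.5, "Speaker 0", "Hi there."),
    Segment(2.5, 6.0, "Alice", "I moved to Berlin last year."),
]
diff = compute_speaker_diff(old_segments, new_segments)
```

Passing only the changed segments downstream gives the LLM a small, focused input from which to propose ADD/UPDATE/DELETE actions, instead of the full transcript.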
sequenceDiagram
participant Transcriber as VibeVoice Transcriber
participant AudioFile as Audio File
participant Batching as Batching Module
participant Batch as Single Batch
participant LLMContext as Context Manager
participant Stitching as Stitching Module
Transcriber->>AudioFile: Load duration
alt Duration > Batch Threshold
Transcriber->>Batching: split_audio_file()
Batching-->>Transcriber: List of windows with times
loop For Each Window
Transcriber->>Batch: transcribe(window, context_info)
LLMContext->>Batch: Inject hot_words + previous_text
Batch-->>Transcriber: Batch result
Transcriber->>LLMContext: extract_context_tail()
LLMContext-->>Transcriber: Context for next window
end
Transcriber->>Stitching: stitch_transcription_results()
Stitching-->>Transcriber: Merged result
else Duration <= Batch Threshold
Transcriber->>Batch: transcribe(audio, context_info)
Batch-->>Transcriber: Single result
end
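
The batching loop in the third diagram can be approximated without any audio library. The sketch below is illustrative only: `extract_context_tail` and `stitch_transcription_results` borrow their names from the diagram, while `split_into_windows`, `transcribe_long`, the result dictionary shape, and the `transcribe_window` callback are hypothetical. It assumes each window returns timestamps relative to the window start, which stitching then shifts to absolute time.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Window = Tuple[float, float]  # (start_seconds, end_seconds)


def split_into_windows(duration: float, window_s: float) -> List[Window]:
    """Cut [0, duration) into consecutive windows of at most window_s seconds."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    windows: List[Window] = []
    start = 0.0
    while start < duration:
        end = min(start + window_s, duration)
        windows.append((start, end))
        start = end
    return windows


def extract_context_tail(text: str, max_words: int = 20) -> str:
    """Keep the last few words of a window as context for the next one."""
    return " ".join(text.split()[-max_words:])


def stitch_transcription_results(batches: List[Tuple[float, Dict]]) -> Dict:
    """Merge per-window results, shifting segment times by each window offset."""
    segments = []
    for offset, result in batches:
        for seg in result["segments"]:
            segments.append(
                {**seg, "start": seg["start"] + offset, "end": seg["end"] + offset}
            )
    return {"text": " ".join(r["text"] for _, r in batches), "segments": segments}


def transcribe_long(
    duration: float,
    window_s: float,
    transcribe_window: Callable[[float, float, Dict], Dict],
    hot_words: Sequence[str] = (),
) -> Dict:
    # Short audio takes the single-batch path from the diagram's else branch.
    windows = [(0.0, duration)] if duration <= window_s else split_into_windows(duration, window_s)
    previous_text = ""
    batches: List[Tuple[float, Dict]] = []
    for start, end in windows:
        context = {"hot_words": list(hot_words), "previous_text": previous_text}
        result = transcribe_window(start, end, context)
        batches.append((start, result))
        previous_text = extract_context_tail(result["text"])
    return stitch_transcription_results(batches)


# Usage with a fake transcriber that records the context it received.
seen_context: List[str] = []


def fake_transcribe(start: float, end: float, context: Dict) -> Dict:
    seen_context.append(context["previous_text"])
    text = f"chunk {int(start)}"
    return {"text": text, "segments": [{"start": 0.0, "end": end - start, "text": text}]}


merged = transcribe_long(25.0, 10.0, fake_transcribe, hot_words=["Neo4j"])
```

Carrying only a short tail of the previous window keeps the prompt small while still giving the model continuity across window boundaries. The real batching parameters live in defaults.yml according to the commit notes, and their names are not shown here.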
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
@coderabbitai review
✅ Actions performed: Review triggered.
Summary by CodeRabbit
Release Notes
New Features
Improvements
Configuration